Determining oil migration distances from source rocks to reservoirs can greatly help in the search for new 
petroleum accumulations. Concentrations and ratios of polar organic compounds are known to change due 
to preferential sorption of these compounds in migrating oils onto immobile mineral surfaces. However, 
these compounds cannot be directly used as proxies for oil migration distances because of the influence of 
source variability. Here we show that for each source facies, the ratio of the concentration of a selected polar 
organic compound to its initial concentration at a reference point is independent of source variability and 
correlates solely with migration distance from source rock to reservoir. Case studies serve to demonstrate 
that this new index provides a valid solution for determining source-reservoir distance and could lead to 
many applications in fundamental and applied petroleum geoscience studies. 

Petroleum is generated via thermal alteration of buried organic matter in source rocks, followed by oil 
expulsion (primary migration) out of those source rocks. Petroleum accumulations are formed mainly 
via subsequent secondary migration from source rocks through carrier beds to traps. Information about the 
directions, pathways and distances of secondary petroleum migration is required in the search for new petroleum 
resources. However, secondary petroleum migration still remains the least understood of the processes involved 
in petroleum accumulation. Because fractionation of polar organic molecules can result from preferential sorp- 
tion of these compounds in migrating oils onto immobile mineral surfaces or through partitioning into formation 
water, molecular indices that are correlated solely with the absolute or relative distances migrated by 
oils have been sought for decades 1,2 , but with limited success. 

Nitrogen-, sulfur- and oxygen- containing compounds exhibit strong sorption on minerals and/or high solu- 
bility in water due to their polarities, and thus variations in the distribution of these molecules are used to study oil 
migration processes 3,4 . However, those compounds with high solubility in water such as alkylphenols can be easily 
affected by water saturation, water washing and injection for enhancement of oil production, and are sensitive to 
any oil-water interactions in the subsurface environment 5 . Concentrations and ratios of carbazoles 
(nitrogen-containing compounds) were previously used as proxies of secondary migration distances 2,6-14 , based on their low 
solubility in water 15 . However, recent studies show that these empirical indicators do not solely reflect migration- 
related fractionations and thus do not actually correlate with migration distance, because their concentrations 
and ratios can also be affected by variations in organic facies (such as marine, lacustrine, or terrigenous organics; 
anoxic or suboxic depositional environment; carbonate or shale lithology) and thermal maturation of source 
rocks as well as biodegradation of oils 8,10,16-20 . In addition, it appears that properties of migration systems, such as 
porosity, sorption coefficients, oil saturation and oil volume, may also influence the utility of these tracers 4 . 

Among these influences, the biodegradation effect on carbazoles is negligible when biodegradation levels are 
less than 3 on the scale of Peters and Moldowan (1993) 20,21 . On the other hand, source input influences due to 
variations in source facies and maturity of organic matter, a parameter related to the maximum temperature 
experienced by source rocks at the time of oil expulsion, are significant and thus cannot be ignored 8,10,16,17,19,22 . The 
influence due to the variability of source facies can be minimized by grouping oils according to their source 
facies 22 . However, the maturity effect has been difficult to evaluate and impeded studies of secondary petroleum 
migration. 

It is difficult, if not impossible, to find an oil component in nature that is independent of source input 
influences. Nevertheless, it is feasible to set up a secondary migration fractionation index (SMFI) that is 
independent of source input influences, reflects only migration-related 
fractionation and thus correlates directly with migration 
distance. Here, we advance the concept of SMFI as a reliable measure 
of migration fractionation and migration distances for a uniform 
migration system, where porosity, density of solids, sorption coeffi- 
cients, migration velocity of oil and oil saturation are kept constant. 
More realistic migration systems with variable properties could be 
treated by dividing them into subsections with constant properties. 
The SMFI is defined as the ratio of the concentration of a large polar 
compound (heavier than 160 Dalton) with low concentration (e.g. 
carbazoles) to its initial concentration at a reference point for each 
source facies. In other words, the concentration of the large polar 
compound is actually a product of the initial concentration (con- 
trolled by source input influences) and the index that characterizes 
fractionation solely with secondary migration distance (see Equa- 
tions (1, 4 and 5) in the methods section). Oil volume passing 
through a carrier bed or multiple charging events do not affect the 
validity of the ratio, when appropriate compounds with very low 
concentration in petroleum are selected as tracers (see details 
in the multiple charging and oil volume section in the online 
Supplementary Information). This new SMFI is fully described in 
the methods section, and mathematically derived in the Supplemen- 
tary Information based on the mass balance principle and advection- 
reaction-dispersion theory. 

We then apply and test the validity of the new index in the Ordos 
and Western Canada Sedimentary basins where both concentrations 
and ratios of carbazoles do not effectively reflect migration-related 
fractionations and thus the distances of secondary petroleum migra- 
tion. The narrow, long, continuous and clay-rich sand body of the 
Xifeng Oilfield in the Ordos Basin provides an excellent opportunity to 
test if our index can work well. The Rimbey-Meadowbrook reef trend 
in the Western Canada Sedimentary Basin is a classical example used 
to develop the Gussow theory 23 of differential petroleum entrapment 
involving long distance migration along the reef trend in the up-dip 
direction. However, this theory is still being debated mainly because 
the empirical indicators do not show obvious migration fractionations 
for most oils along the reef trend. We demonstrate that our SMFI fits 
the actual data well and is a reliable odometer for the distance of 
secondary petroleum migration, and we provide supporting evidence 
for the Gussow theory. The new index represents a significant step 
forward in petroleum geoscience as it can be used to reveal the distri- 
bution patterns of petroleum accumulations in sedimentary basins and 
to study theories on petroleum accumulation. 

Results 

We first tested the utility of the SMFI in the Xifeng Oilfield in the 
southwest part of the Ordos Basin in China (Fig. 1a). Much of 
the basin lacks well-developed fault systems except in areas along 
the basin margins 24 . The reservoirs of this field are distributed in a 
narrow, long, continuous and clay-rich sand-body in the Eighth 
member of the Upper Triassic Yanchang Formation with low por- 
osity (5.4-16.6%, 9.9% on average) and low permeability (0.1-36.9 
millidarcy (mD), mainly 0.6-3.0 mD) 25,26 . The sand bodies in this 
member in the southwest part of the basin (Fig. 1) were deposited in 
the delta front of a braided river system 27 . The sedimentary micro- 
facies of these sand bodies include submerged distributary channels 
and mouth bars 28 . The sand bodies have undergone various stages of 
diagenesis 29 , leading to low porosity and permeability of reservoirs in 
the field. The main source rocks are dark oil shales in the Seventh 
member of the Yanchang Formation 30 , with the peak oil generation 
and migration occurring at the end of the early Cretaceous 31 . The 
source kitchen is mainly distributed to the northeast of the Xifeng 
Oilfield (Fig. 1b). 

Nineteen crude oil samples were collected from the producing 
wells in the Xifeng Oilfield (Fig. 1b). These samples were analyzed 
for saturated and aromatic hydrocarbons and carbazoles, following 
the procedures described previously 7,32 (data are presented in 
Supplementary Tables S1-S4). The relative uncertainties of the data 
are typically < 10%. 

Figure 1 | Distributions of oilfields and the main source rocks in the Ordos Basin, China. (a) Structural contour map for the top of the Eighth member 
of the Upper Triassic Yanchang Formation (modified from Chen et al., 2006 and Zhang et al., 2009) 45,46 and the oilfield distribution in the basin. The 
contours are in meters below and above sea level as indicated by plus and minus signs in front of numbers. (b) Distributions of source rocks in the 
Seventh member of the formation (thickness in meters) and sand bodies in the Eighth member in the southwest part of the basin (modified from Yang and 
Zhang, 2005), together with sampling locations. 

These data show that oils in the Xifeng Oilfield have consistent 
source facies (see Supplementary Figs. S1-S2 and relevant discussion 
in the Supplementary Information) and thus there is no need to 
separate these oils into genetic groups. The extent of biodegradation 
in the Xifeng Oilfield is below level 1 on the biodegradation scale of 
Peters and Moldowan 21 and thus the biodegradation effect on carbazoles 
can be safely ignored. This is further demonstrated in detail in 
the Supplementary Information (the biodegradation level subsection). 

To calculate the relative migration distance, a reference point is 
needed (see details in the methods section). The reference point 
determined for the Xifeng Oilfield is located at the northeastern edge 
of the sand body, close to the source kitchen, as shown in Fig. 1b. 
The relative distance is defined as the length of the trend line of the 
sand body from the reference point to the projected point of a sam- 
pling well on the trend line. The relative distance was calculated for 
all samples in this manner and is listed in Supplementary Table S1. 
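
The projection described above can be sketched in code. This is only a minimal illustration, not the procedure actually used to produce Supplementary Table S1: it assumes that the trend line of the sand body has been digitized as a polyline of planar coordinates (in km) whose first vertex is the reference point. The function name `relative_distance` and the coordinates in the example are hypothetical.

```python
import math

def relative_distance(trend, well):
    """Along-line distance (km) from trend[0] to the projection of `well`.

    `trend` is a list of (x, y) vertices in km, starting at the reference
    point (assumed digitization of the sand-body trend line).
    `well` is the (x, y) location of a sampling well in the same units.
    The well is projected onto the closest point of the polyline.
    """
    best_d2 = math.inf   # smallest squared offset from the trend line so far
    best_s = 0.0         # along-line distance of that closest point
    s0 = 0.0             # cumulative length at the start of the segment
    for (x1, y1), (x2, y2) in zip(trend, trend[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg2 = dx * dx + dy * dy
        if seg2 == 0.0:
            continue     # skip duplicated vertices
        seg = math.sqrt(seg2)
        # Fractional position of the projection, clamped to the segment.
        t = ((well[0] - x1) * dx + (well[1] - y1) * dy) / seg2
        t = max(0.0, min(1.0, t))
        px, py = x1 + t * dx, y1 + t * dy
        d2 = (well[0] - px) ** 2 + (well[1] - py) ** 2
        if d2 < best_d2:
            best_d2, best_s = d2, s0 + t * seg
        s0 += seg
    return best_s

# Hypothetical trend line with one bend, and two hypothetical wells.
trend_line = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
print(relative_distance(trend_line, (4.0, 3.0)))
print(relative_distance(trend_line, (12.0, 5.0)))
```

Clamping the projection to each segment keeps wells that lie beyond a bend from being assigned to an extrapolated part of the line.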

Many maturity parameters vary along the sand body of the Xifeng 
Oilfield (Supplementary Figs. S3-S4). Vitrinite reflectance (Ro) is a 
commonly used thermal maturity indicator of organic matter in 
source rocks. Maturity levels of oils are quantitatively determined 
as a vitrinite reflectance equivalent (Ro(equiv.)). To constrain the 
thermal maturity range of the studied oils, Ro(equiv.) values were 
calculated using aromatic hydrocarbons 33,34 as detailed in the thermal 
maturity subsection of the Supplementary Information. The calculated 
Ro(equiv.) values (Ro(equiv.) = 0.14(4,6-DMDBT/1,4-DMDBT) + 0.57; 
DMDBT = dimethyl dibenzothiophene) 34 are in 
a narrow range of 0.69% to 0.77% and show a clear decreasing trend 
with increasing distance throughout the oilfield (Fig. 2). 
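
The calibration quoted above is a one-line calculation. The sketch below simply restates it; the function name `ro_equiv` and the example ratio are illustrative assumptions, and the input is taken to be the two isomer abundances in the same (arbitrary) units.

```python
def ro_equiv(dmdbt_46, dmdbt_14):
    """Vitrinite reflectance equivalent (%) from the 4,6-/1,4-DMDBT ratio.

    Implements Ro(equiv.) = 0.14 * (4,6-DMDBT / 1,4-DMDBT) + 0.57 as quoted
    in the text. Both arguments must be abundances in the same units.
    """
    if dmdbt_14 <= 0:
        raise ValueError("1,4-DMDBT abundance must be positive")
    return 0.14 * (dmdbt_46 / dmdbt_14) + 0.57

# Hypothetical sample in which the two isomers are equally abundant.
print(round(ro_equiv(1.0, 1.0), 2))
```

With this calibration, the reported range of 0.69% to 0.77% corresponds to 4,6-DMDBT/1,4-DMDBT ratios of roughly 0.86 to 1.43.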

The carbazoles of the studied oils display a predominance of alkyl- 
carbazoles over benzocarbazoles. The ratios of alkylcarbazoles/ 
(alkyl- + benzocarbazoles) are all close to unity (Supplementary 
Table S4). Therefore, we focused on alkylcarbazoles. The concentra- 
tions of alkylcarbazoles decrease with increasing migration distance 
(Figs. 3a-f) and were thought to reflect secondary petroleum migra- 
tion 6-14 . If so, their ratios should also be distance indicators. However, 
as shown in Fig. 3g-l, this expectation is not supported by the ratios 
of N-H exposed/partially exposed, of exposed/shielded and of par- 
tially exposed/shielded dimethylcarbazole isomers. Clearly, other 
factors are involved and must be teased out before we can use these 
tracers to track secondary migration distance. The differences in 
chemical sorption activity of alkylcarbazole isomers for hydrogen 
bond formation arise mainly from steric effects related to alkylation 
position 35 . The sorption of N-H exposed alkylcarbazole isomers (e.g., 
2,7-dimethylcarbazole) is stronger than that of alkylcarbazole isomers 
with partially exposed N-H (e.g., 1,7-dimethylcarbazole). The 
sorption of partially exposed isomers is stronger than that of N-H 
shielded alkylcarbazole isomers (e.g., 1,8-dimethylcarbazole) 7,35-39 . 
Owing to this shielding effect, the ratios in Figs. 3g-l should decrease 
with increasing relative migration distance if fractionations without 
source input influences occurred during secondary petroleum migration. 
However, this decreasing trend is not evident in the data, likely due to 
the influence of maturity variations of these oils on alkylcarbazoles in 
this oilfield. 

Figure 2 | Distribution of equivalent vitrinite reflectance (%Ro) 
calculated from the 4,6-DMDBT/1,4-DMDBT ratio along the sand body of 
the Xifeng Oilfield. Ro(equiv.) = 0.14(4,6-DMDBT/1,4-DMDBT) + 0.57 
(Luo et al., 2001) 34 . DMDBT = dimethyl dibenzothiophene. 

To isolate maturity influence, we set up a maturity influence index 
to quantitatively evaluate the maturity effect (Equation (3) in the 
methods section). The values of the maturity influence index are 
calculated from the derivative of Ro (dRo/dx), where x is the relative 
migration distance, and the constants $a_2$ and $a_3$, where $a_2$ is a rate of 
change of initial concentration with Ro and $a_3$ is related to sorption 
and oil migration velocity (Equations (S5, S8 and S20) in the online 
Supplementary Information). Note that $a_1$ cancels out in Equation 
(3). Although the constants $a_2$ and $a_3$ have fixed geochemical meanings, 
their values are determined by the properties of a specific migration 
system represented by the geochemical data of the system. To 
determine the values of $a_2$ and $a_3$, non-linear regression analysis was 
performed using Equation (1) in the methods section, with the 
values of relative distance in Supplementary Table S1, the Ro(equiv.) 
values in Supplementary Table S3, and the concentrations of alkylcarbazoles 
in Supplementary Table S4 as input data. The results are 
listed in Supplementary Table S5. The $a_2$ values vary from <0 to >50. 
The $a_2$ values less than zero indicate that the concentrations of these 
alkylcarbazoles in the studied oils decrease with maturity. Studies of 
source rocks have also revealed that benzocarbazole concentrations 
decrease with maturity over a similar maturity range of 0.68% to 
0.78% in Ro 16 . This may represent a stage of dilution of some carbazoles 
due to a preferential increase of other components. 

The sand body of the Xifeng Oilfield can be divided into two 
sections according to dRo/dx (Fig. 2). The dRo/dx value in the 
section from 62 to 90 km is very low (Fig. 2), resulting in small values 
of maturity influence index (<5%, in Supplementary Table S5). In 
Figs. 3g-i and k-l, ratios of alkylcarbazoles in the section between 62 and 
90 km show a weak decreasing trend with increasing relative migration 
distance. In the section from 51 to 62 km, however, the values of the 
maturity influence index reach 13.3-51.1% (Supplementary Table S5) and 
fractionation of alkylcarbazoles is not apparent (Figs. 3g-l). These 
observations illustrate that when the maturity influence index is 
<5%, the maturity influence is not evident, but when it is ≥5%, the 
maturity influence needs to be addressed. 

Because the information of maturity influence is carried by the 
initial concentration (see the methods section for details), the matur- 
ity influence is accommodated by using the SMFI (Equation (4) in 
the methods section). Therefore, the SMFI of alkylcarbazoles is inde- 
pendent of maturity influence and offers an effective opportunity to 
assess the migration fractionation along the sand body (Equation (5) 
in the methods section). Figs. 4a-d show the relationships between 
SMFIs of individual carbazoles and relative migration distances. The 
samples in the section from 51 to 62 km are close to the regression 
lines and these regression lines all pass through the model value, 
which equals 100% at the reference point, within analytical 
uncertainty. 
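
To make the normalization concrete, the sketch below computes an SMFI value from a measured concentration. Equation (4) is given in the methods section; the form used here is an assumption based on the definition of the index (measured concentration divided by the maturity-dependent initial concentration $a_1(1 + a_2 Ro)$ of Equation (1), expressed in percent). The function names and all parameter values are hypothetical and are not taken from Supplementary Table S5.

```python
import math

def initial_concentration(a1, a2, ro):
    """Initial concentration C0 = a1 * (1 + a2 * Ro), as in Equation (1)."""
    return a1 * (1.0 + a2 * ro)

def smfi_percent(conc, a1, a2, ro):
    """SMFI in percent: measured concentration relative to C0(Ro).

    Assumed form SMFI = 100 * C / C0; under Equation (1) this equals
    100 * exp(a3 * x) and no longer depends on Ro.
    """
    c0 = initial_concentration(a1, a2, ro)
    if c0 <= 0:
        raise ValueError("initial concentration must be positive")
    return 100.0 * conc / c0

# Hypothetical constants and one synthetic oil 60 km from the reference point.
a1, a2, a3 = 2.0, 5.0, -0.03          # ug/g, dimensionless, 1/km
ro, x = 0.70, 60.0                    # % Ro(equiv.), km
conc = initial_concentration(a1, a2, ro) * math.exp(a3 * x)
print(smfi_percent(conc, a1, a2, ro))
```

Because the maturity-dependent factor is divided out, two oils at the same distance but with different Ro(equiv.) values give the same SMFI in this model.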

Sums and ratios of concentrations of carbazoles, often used in the 
literature, cannot be directly used as migration indicators. This is 
obvious from Equation (1), where each compound has its own coefficients 
$a_1$, $a_2$ and $a_3$, and Ro varies. Geometric means of SMFI 
values for different types of dimethylcarbazoles, on the other hand, 
can be used as tracers for secondary migration distance as shown by 
Equation (6) in the methods section. As shown in Figs. 4e and f, the 
geometric means of SMFIs are strongly correlated with relative 
migration distances. The absolute values of $a_3$ in Equation (1) in 
the methods section, which are inversely proportional to the absolute 
values of migration velocity of these carbazoles (Supplementary 
Equation (S20)), can be used to reflect migration fractionations of 
dimethylcarbazoles (DMCA). The regression equations in Figs. 4e, 4f 
and 4d show that the absolute value of $a_3$ of N-H exposed DMCA 
(0.046 km⁻¹) > partially exposed DMCA (0.036 km⁻¹) > shielded 
DMCA (i.e. 1,8-DMCA) (0.028 km⁻¹). The migration sequence 
inferred is that shielded DMCA migrated faster than partially 
exposed DMCA, and partially exposed DMCA faster than exposed 
DMCA. This sequence corresponds to the retardation differences of 
dimethylcarbazoles determined by their respective sorption coefficients 
arising from steric effects related to the alkylation position 35 
(Supplementary Equations (S5, S8 and S20)). 

Figure 3 | Distributions of MCA, DMCA, EDMCA, PEDMCA, SDMCA and their ratios of the studied oils along the sand body of the Xifeng Oilfield. 

MCA: methyl carbazoles; DMCA: dimethyl carbazoles; EDMCA, PEDMCA and SDMCA: exposed, partially exposed and shielded DMCA; SDMCA: 
1,8-DMCA; 2,7/1,4-DMCA: 2,7-DMCA/1,4-DMCA; 2,7/1,8-DMCA: 2,7-DMCA/1,8-DMCA; 1,7/1,8-DMCA: 1,7-DMCA/1,8-DMCA. 
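
The geometric mean and the use of an SMFI as an odometer can be illustrated with a short sketch. It assumes the idealized model form SMFI = 100% × exp($a_3 x$) with a regression line passing exactly through 100% at the reference point, which the fitted lines only do within analytical uncertainty. The function names are hypothetical; the three $a_3$ magnitudes are the values quoted above, with the negative sign required for sorption.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive SMFI values (e.g. isomers of one type)."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))

def distance_from_smfi(smfi_percent, a3):
    """Invert SMFI = 100 * exp(a3 * x) for the relative distance x (km)."""
    if a3 >= 0:
        raise ValueError("a3 must be negative when only sorption occurs")
    return math.log(smfi_percent / 100.0) / a3

# a3 values (1/km) for the three DMCA groups, signs assumed negative.
a3_by_group = {"exposed": -0.046, "partially exposed": -0.036, "shielded": -0.028}

# Distance over which each SMFI would fall to 50% in this idealized model:
# roughly 15, 19 and 25 km, respectively.
for group, a3 in a3_by_group.items():
    print(group, round(distance_from_smfi(50.0, a3), 1))
```

The more strongly sorbed (exposed) isomers lose half of their normalized concentration over the shortest distance, which is the migration sequence inferred in the text.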

We further show that the ratios of geometric means of SMFIs of 
N-H exposed/partially exposed, exposed/shielded, and partially 
exposed/shielded dimethylcarbazoles, as well as the corresponding 
SMFI ratios of individual dimethylcarbazoles can also serve as 
odometers for secondary petroleum migration, based on Equations 
(7 and 8) in the methods section. In Figs. 4g-l, these ratios all 
decrease with migration distance and their regression lines all pass 
through the model value of 1 at the reference point within analytical 
uncertainty. 
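
Why such ratios decrease with distance follows directly from the exponential form of Equation (1). As a hedged illustration (Equations (7) and (8) themselves are given in the methods section and are not reproduced here), assume that each SMFI has the form 100% × $e^{a_3 x}$. The subscripts E (exposed) and S (shielded) are notation introduced only for this example, and the numerical values are the geometric-mean $a_3$ magnitudes quoted for Figs. 4e and 4d, taken with a negative sign.

```latex
\frac{\mathrm{SMFI}_E}{\mathrm{SMFI}_S}
  = \frac{e^{a_{3,E}\,x}}{e^{a_{3,S}\,x}}
  = e^{(a_{3,E}-a_{3,S})\,x}
  \approx e^{-0.018\,x}
  \qquad \text{for } a_{3,E} = -0.046\ \mathrm{km^{-1}},\;
                     a_{3,S} = -0.028\ \mathrm{km^{-1}}
```

The ratio equals 1 at x = 0 and decreases with distance whenever the compound in the numerator is the more strongly sorbed one; the slopes actually fitted in Figs. 4g-l may differ from this idealized value.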

We have thus demonstrated that: (1) the SMFI fits the real data; (2) 
the higher the sorption coefficient of a molecule, the slower its 
migration velocity and the more evident its fractionation; 
and (3) the petroleum in the Xifeng Oilfield migrated along the sand 
body from the source kitchen into the field in the SW direction 
(Fig. 1b). 

We now further apply the SMFI concept to oils in the carbonate 
reservoirs in the Rimbey-Meadowbrook reef trend in the Western 
Canada Sedimentary Basin to evaluate its validity. As mentioned in 
the introduction, the Gussow theory 23 was derived from this trend, 
but convincing evidence for long distance migration along the 
trend has not yet been obtained. The Ro(equiv.) values of the oils 
along this trend vary from 0.68% to 0.86% (Supplementary Table S6). 
Sorption capabilities of carbazoles on minerals in carbonate reser- 
voirs are very low compared to clastic reservoirs 1 ". Therefore, ben- 
zocarbazoles were examined in these oils, as they are more easily 
adsorbed than alkylcarbazoles 7 . The results show that the maturity 
influence index of benzocarbazoles in the reef trend can reach 85.8% 
(Supplementary Table S7). The SMFIs and the ratio between SMFIs 
of benzocarbazoles, computed from the data in Supplementary Table 
S6, clearly show fractionations consistent with long distance migration 
along the reef trend in the up-dip direction with remarkably high 
correlation coefficients (Supplementary Figs. S5 and S6), providing 
basic evidence for the Gussow theory 23 . This is in good agreement 
with the results of oil-source correlation studies that include maturity 
information 8,9 . The various lines of evidence suggest that the Gussow 
theory is generally applicable. Further details are discussed in the 
Supplementary Information. 

Figure 4 | Correlations showing inferred relative migration distances from SMFIs of alkylcarbazoles, the geometric SMFI means of different kinds of 
dimethyl carbazoles and their ratios in the Xifeng Oilfield. SMFI: secondary migration fractionation index; MCA: methylcarbazoles; DMCA: 
dimethylcarbazoles; EDMCA, PEDMCA and SDMCA: exposed, partially exposed and shielded DMCA; SDMCA: 1,8-DMCA; GM(EDMCA), 
GM(PEDMCA) and GM(SDMCA): geometric means of SMFIs of EDMCA, PEDMCA and SDMCA, respectively; 2,7/1,4-DMCA SMFI: ratio of SMFI of 
2,7-DMCA to SMFI of 1,4-DMCA (4j); 2,7/1,8-DMCA SMFI: ratio of SMFI of 2,7-DMCA to SMFI of 1,8-DMCA (4k); 1,7/1,8-DMCA SMFI: ratio of 
SMFI of 1,7-DMCA to SMFI of 1,8-DMCA (4l). The SMFI value of 100 (%) and SMFI ratio of 1 at the reference point (x = 0 km) are the model values. All 
the regression lines were obtained using only the actual data points, without forcing them through the reference point. Therefore, they are derived only from 
the data. 

Discussion 

Carbazoles not only have stronger sorption capabilities than nonpo- 
lar compounds but also hold information about their source inputs 
including source facies and maturity variations. Our study shows 
that small maturity variations of less than 0.2% in Ro (vitrinite 
reflectance) can contribute to over 50% of the concentration varia- 
tions of alkyl- and benzocarbazoles. Given that the bulk of petroleum 
generation/expulsion occurs over the maturity range of 0.6% to 1.0% 
in Ro (ref. 2), the concentrations and ratios of carbazoles cannot be 
used directly as proxies for secondary petroleum migration distance 
in most basins where there exist significant influences of source 
variability. The secondary migration fractionation index, established 
in this paper, offers an effective solution to this problem and can 
serve as a distance indicator for secondary migration, as it eliminates 
the source maturity effect on oils grouped according to source facies 
and only reflects migration fractionation. This approach can be 
applied to other low concentration, large polar compounds with 
different sorption coefficients between isomers, although it is shown 
in this study for alkyl- and benzocarbazoles. 

The ability of our index to reliably monitor secondary migration 
distances may lead to many applications in fundamental and applied 
petroleum geoscience studies. The index outlined here is a step 
towards correctly interpreting the behavior of low concentration, 
polar organic compounds in petroleum and thus it can help resolve 
many important questions in organic geochemistry and petroleum 
geology. Moreover, secondary petroleum migration in many basins 
around the world is poorly understood and yet the information about 
this process is most important for petroleum exploration 28 . Our 
index provides a new tool that can aid in the discovery of new 
resources via accurate assessment of the directions, pathways and 
distances of petroleum migration. The method established for calculating 
the SMFI in this paper may not be universally applicable to oil 
accumulations with geometries other than a simple linear one. 
Extending the method to complex petroleum migration systems is the 
subject of future investigation. 

Methods 

Knowing the direction, pathway and distance of lateral secondary migration is 
essential in searching for new petroleum accumulations. In the following we develop a 
new methodology to track the distances of lateral secondary migration of oil through 
porous strata (such as sand bodies) or unconformities (erosional or non-depositional 
surfaces separating two strata of different ages). 

From the mass balance principle, a general advection-reaction-dispersion 
equation 40,41 (Supplementary Equation (S1)) can be established for secondary petroleum 
migration in a one-dimensional pathway. The properties and types of pathways were 
studied by Yang et al. (2005) 4 . To investigate the source input influence, we focus here 
on a uniform migration system, in which the properties of the system, including 
porosity, density of solids, sorption coefficients, migration velocity of oil and oil 
saturation, are constant. 

The general advection-reaction-dispersion equation can be simplified under the 
conditions below. When large polar compounds such as carbazoles are selected for a 
secondary migration study, molecular diffusion is insignificant 42 and can be safely 
neglected 4 . Lateral migration is very slow, especially in cratonic basins such as the 
Ordos Basin. Precisely because of slow migration, the effect of mechanical dispersion 
(caused by differences in microscopic migration velocities on a pore scale) is smaller 
than that of molecular diffusion and thus can be omitted 40 . Therefore, dispersion 
including molecular diffusion and mechanical dispersion can be neglected (see 
discussions in paragraphs following Supplementary Equation (S1)). Partitioning 
between oil and water is neglected because adsorbable compounds with low solubi- 
lities in water must be selected for a secondary migration study. Secondary petroleum 
migration in carrier beds in the up-dip direction results in decreases in temperature 
and thus holds back or slows down the thermal evolution of oils if the basin does not 
subside substantially. In this scenario, we assume that only sorption occurs during 
secondary migration, to reveal migration fractionation. Thus, the general 
advection-reaction-dispersion equation reduces to the advection-sorption equation 
(Supplementary Equation (S4)). 

Sorption of carbazoles in migration systems can approach equilibrium on geological 
time scales 4,43 . In sorption equilibrium theory, the linear isotherm model 
(Supplementary Equation (S5)) is valid for natural systems where concentrations 
of adsorbable compounds are low 44 . From the advection-sorption equation and linear 
isotherm model, we can derive the source-dependent migration model that describes 
how the concentration of a carbazole (or any other large, polar compound present at 
trace concentration levels) varies with maturity and migration distance for any given 
type of source facies (see the detailed derivation in Supplementary Equations 
(S1) to (S21)): 

$$C(x,t) = C_0(t)\,e^{a_3 x} = a_1 (1 + a_2 Ro)\,e^{a_3 x} \qquad (1)$$

where $C_0(t)$ is the initial concentration of a carbazole at the filling point (i.e. starting 
point of secondary petroleum migration); Ro (vitrinite reflectance) as a maturity 
variable is a function of time for a given type of source facies; both vitrinite reflectance 
(Ro) and its equivalent (Ro(equiv.)) quantitatively indicate the maturity levels with 
the same units, and thus are represented by one variable (Ro) in the equations; $C(x,t)$ 
and $C_0(t)$ can also be expressed as $C(x,Ro)$ and $C_0(Ro)$, respectively (refer to 
Supplementary Equations (S21 and S22)); $a_1$, $a_2$ and $a_3$ are constants, which can be 
determined through non-linear regression analysis of Equation (1). The parameter $a_1$ 
is dictated by geochemical processes of hydrocarbon generation and fractionations in 
primary migration or migration before the reference point as defined in the next 
paragraph. It has units of concentration (µg/g). $a_2$ is a rate of change of initial 
concentration with Ro, defined in Supplementary Equation (S12). It is a dimensionless 
constant. If $a_2 > 0$, $C_0(t)$ increases with Ro; if $a_2 < 0$, $C_0(t)$ decreases with Ro. 
Oils from different source facies will have different values of the constants $a_1$ and $a_2$ in 
Equation (1), reflecting different source input influences due to maturity variations 
among different source facies. The parameter $a_3$ is proportional to the ratio of the 
retardation factor of an adsorbable compound to oil migration velocity or is inversely 
proportional to migration velocity of the compound (Supplementary Equation 
(S20)). It has units of km⁻¹. The value of $a_3$ is always negative when sorption occurs 
without any other reactions. Equation (1) indicates that low values of $a_3$ (i.e. large 
absolute value) cause rapid concentration decreases of the large polar compounds 
with migration distance if only the sorption effect is considered. 
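
The regression for the three constants can be sketched with the standard library alone. The text does not state which regression procedure or software was used, so the following is an illustrative substitute rather than the authors' method: for a fixed trial value of $a_3$, Equation (1) is linear in $b_0 = a_1$ and $b_1 = a_1 a_2$, so those two are solved by ordinary least squares while $a_3$ is scanned over a grid. The function names, the grid limits and the synthetic data are all assumptions.

```python
import math

def fit_linear_part(xs, ros, cs, a3):
    """For fixed a3, least-squares fit of C = (b0 + b1 * Ro) * exp(a3 * x).

    Returns (b0, b1, sse). Requires Ro to vary between samples; otherwise
    the two regressors are collinear and the determinant is zero.
    """
    suu = suv = svv = suc = svc = 0.0
    for x, ro, c in zip(xs, ros, cs):
        u = math.exp(a3 * x)      # regressor multiplying b0
        v = ro * u                # regressor multiplying b1
        suu += u * u
        suv += u * v
        svv += v * v
        suc += u * c
        svc += v * c
    det = suu * svv - suv * suv
    b0 = (suc * svv - svc * suv) / det
    b1 = (svc * suu - suc * suv) / det
    sse = sum((c - (b0 + b1 * ro) * math.exp(a3 * x)) ** 2
              for x, ro, c in zip(xs, ros, cs))
    return b0, b1, sse

def fit_equation_1(xs, ros, cs, a3_min=-0.2, a3_max=0.0, steps=2000):
    """Grid search over a3 (1/km); returns (a1, a2, a3) with the lowest SSE."""
    best = None
    for k in range(steps + 1):
        a3 = a3_min + (a3_max - a3_min) * k / steps
        b0, b1, sse = fit_linear_part(xs, ros, cs, a3)
        if best is None or sse < best[0]:
            best = (sse, a3, b0, b1)
    _, a3, b0, b1 = best
    return b0, b1 / b0, a3        # a1 = b0, a2 = b1 / a1

# Synthetic, noise-free data generated from hypothetical constants.
xs = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]          # relative distance, km
ros = [0.77, 0.74, 0.75, 0.71, 0.72, 0.69]        # Ro(equiv.), %
cs = [2.0 * (1 + 5.0 * r) * math.exp(-0.03 * x) for x, r in zip(xs, ros)]
print(fit_equation_1(xs, ros, cs))
```

With real data the resolution of the result is limited by the grid step and by the analytical uncertainty of the concentrations, so a dedicated non-linear least-squares routine would normally be preferred.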

In the above, the filling point was used to define the absolute migration distance x 
in the theoretical analysis (see details in the model section of the Supplementary 
Information). However, because it is difficult to determine the filling point in practice, 
a reference point is often used to determine the relative migration distance, which 
usually is located behind the filling point in a pathway and thus results in a $\Delta x$ 
representing the distance between the filling point and the reference point. 
Nonetheless, the relative migration distances can be directly used for estimating $a_1$, $a_2$ 
and $a_3$ (without any correction for $\Delta x$) via non-linear regression analysis of Equation 
(1), as $a_2$ and $a_3$ do not change with $\Delta x$. Although $a_1$ varies with $\Delta x$, the variable $a_1$ 
does not affect the study of the relative migration distance as both are directly 
correlative. 
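
The statement that only $a_1$ depends on the offset can be checked in one line from Equation (1). In the sketch below, $x_r$ (distance measured from the reference point) and $a_1'$ (the constant obtained when regressing against $x_r$) are notation introduced here for illustration only.

```latex
C = a_1 (1 + a_2 Ro)\, e^{a_3 (x_r + \Delta x)}
  = \underbrace{a_1 e^{a_3 \Delta x}}_{a_1'} \,(1 + a_2 Ro)\, e^{a_3 x_r}
```

The offset is absorbed entirely into the prefactor, so $a_2$ and $a_3$ are unchanged, and the ratio of the concentration to its initial value at the reference point is still $e^{a_3 x_r}$.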

Equation (1) is an exponential attenuation law with a variable 
initial concentration that is a function of maturity. The derivation of Equation (1) is 
detailed from Supplementary Equations (S1) to (S22). This functional form is derived 
under the conditions of the linear isotherm sorption, very low dispersion and uniform 
pathways. 

The initial concentration $C_0(t)$ incorporates the source input information of a 
carbazole, its generation from a source rock and fractionation during primary 
migration (oil expulsion). $C_0(t)$ was shown to vary steadily with maturity (Ro) in the 
range of 0.45-1.3% (ref. 8), so that most of the variation can be described by a 
quadratic equation that becomes linear over a narrow Ro range such as 0.7-0.8% 
(Supplementary Table S3) in the Xifeng Oilfield (i.e. $C_0(t) = a_1 (1 + a_2 Ro)$) (see 
Equation (S12) and its relevant discussion in the Supplementary Information). 

As sorption equilibrium is achieved during secondary migration 43 and the thermal 
evolution of the oil either stops or slows down after expulsion, provided that the basin 
does not subside substantially, the present concentrations of a carbazole and 
Ro(equiv.) values of oils can be used to represent $C(x,Ro)$ and Ro values during 
secondary migration in Equation (1). 

Our model (Equation (1)) was derived for uniform migration systems. More 
realistic migration systems with variable properties could be treated by dividing them 
into subsections with constant properties. To ensure the model validity, proper 
compounds must be selected that should satisfy the requirements of sufficiently low 
concentrations in oil, low solubilities in water and strong enough sorption capacity 
(see further details in the multiple charging and oil volume section of the 
Supplementary Information). With these compounds, our model can also be applied 
to carrier systems with multiple charging, which is demonstrated via linearization of 
the Langmuir isotherm model (Supplementary Equations (S27-S32)). The geo- 
chemical conditions for valid application of the model and the selected compounds 
are: (1) the thermal evolution of oils expelled from source rocks ceases or the oil 
migrates in the up-dip direction without substantial basin subsidence after expulsion; 

(2) the primary migration fractionation index is nearly a constant; (3) the relationship 
between Ro and the initial concentrations at the filling point or reference point is 
linear or can be described by a quadratic equation (see Supplementary Information 
for more details); and (4) oil biodegradation levels are < 1 on the biodegradation scale 
of Peters and Moldowan (1993 ) 21 or the effect of oil biodegradation is quantitatively 
removed. 

In a previous study, the quantitative models of factors influencing the distribution 
of phenol and carbazole compounds 4 did not address source input influences. In the 
pivotal model of that study (Equation (33) in Yang et al. (2005) 4 ), the geotracer 
concentration during secondary migration is constant and equal to the initial 
concentration at the filling point. This model can also be derived in our work as a 
special case (Supplementary Equation (S18)). In natural migration systems, however, 
both the geotracer concentration during migration and the initial concentration are 
variable. Our new model (Equation (1)) is developed to address such complexities in 
real systems. 

Quantitative evaluation of source input influences, including organic source facies 
and maturity, is a necessary first step in order to eliminate source input influences. 
Here we begin with the total differential of Equation (1) 

dC(x, Ro) = a_1 a_2 e^{a_3 x} \, dRo + a_1 a_3 (1 + a_2 Ro) e^{a_3 x} \, dx    (2) 

where a_1 a_2 e^{a_3 x} dRo represents the concentration variation of a carbazole caused by 
maturity variation for oils from a given type of source facies, and 
a_1 a_3 (1 + a_2 Ro) e^{a_3 x} dx represents that caused by migration fractionation. The 
maturity influence index (MII) for a given type of source facies is defined as 

MII = \frac{|a_1 a_2 e^{a_3 x} \, dRo|}{|a_1 a_2 e^{a_3 x} \, dRo| + |a_1 a_3 (1 + a_2 Ro) e^{a_3 x} \, dx|} \times 100\% 

    = \frac{\left| \frac{a_2}{1 + a_2 Ro} \frac{dRo}{dx} \right|}{\left| \frac{a_2}{1 + a_2 Ro} \frac{dRo}{dx} \right| + |a_3|} \times 100\% 

    = \frac{\left| \frac{d \ln(1 + a_2 Ro)}{dx} \right|}{\left| \frac{d \ln(1 + a_2 Ro)}{dx} \right| + |a_3|} \times 100\%    (3) 



The migration fractionation contribution index (MFCI) is equal to 100 - MII (%), 
based on Equations (2) and (3). 
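
The two indices can be sketched in a few lines of code. The example below evaluates the middle form of Equation (3), in which the maturity term is |a_2 / (1 + a_2 Ro) · dRo/dx| and the migration term is |a_3|. The function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
def maturity_influence_index(a2, a3, ro, dro_dx):
    """MII (%) from Equation (3).

    a2, a3 : regression constants of Equation (1)
    ro     : maturity Ro (%) at the point of interest
    dro_dx : gradient of Ro along the migration pathway
    """
    maturity_term = abs(a2 * dro_dx / (1.0 + a2 * ro))
    migration_term = abs(a3)
    # Undefined when both terms are zero (no maturity gradient, no fractionation).
    return 100.0 * maturity_term / (maturity_term + migration_term)


def migration_fractionation_contribution_index(a2, a3, ro, dro_dx):
    """MFCI (%) = 100 - MII (%)."""
    return 100.0 - maturity_influence_index(a2, a3, ro, dro_dx)


# Hypothetical inputs, not values from the case studies.
example_mii = maturity_influence_index(a2=-0.5, a3=-0.02, ro=0.75, dro_dx=0.001)
example_mfci = migration_fractionation_contribution_index(-0.5, -0.02, 0.75, 0.001)
print(example_mii, example_mfci)
```

Note that a_1 and the exponential factor cancel, so the index depends only on a_2, a_3, Ro and the maturity gradient dRo/dx.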

The maturity influence index quantitatively indicates the maturity influence in the 
source input information for a given type of source facies. Before using large polar 
compounds (e.g. carbazoles) to study secondary migration, the maturity influence 
index should be calculated to check whether the maturity influence is significant and 
thus must be removed. The case studies (see the results section) illustrate that when 
the maturity influence index is >5%, the concentrations and ratios of 
large polar compounds (e.g. carbazoles) do not solely reflect migration distance and 
the maturity influence must therefore be removed. 

To illustrate the net migration fractionation of a carbazole during secondary 
petroleum migration without maturity influence, we introduce the concept of a 
secondary migration fractionation index (SMFI) for oils from a given type of source 
facies 

SMFI = \frac{C(x, Ro)}{C_0(Ro)} \times 100\% = \frac{C(x, Ro)}{a_1 (1 + a_2 Ro)} \times 100\%    (4) 

Substitution of Equation (1) into (4) yields 

SMFI = e^{a_3 x} \times 100\%    (5) 

Evidently, if a reference point is used instead of the filling point, Equations (3) and (5) 
are still applicable. Since the SMFI as defined above only reflects migration 
fractionation, it serves as an odometer for secondary migration in a uniform pathway. 
SMFI equals 100% at the reference point, which is defined as the model value. In the 
case of multiple source facies, oils are first grouped according to their source facies. 
The a_1, a_2, a_3 and SMFI are then estimated separately for each source 
facies. This minimizes the influence arising from variations in source facies. 
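
The effect of the normalization in Equation (4) can be sketched as follows. Two synthetic oils with different maturities but the same migration distance are generated from the assumed form C(x, Ro) = a_1 (1 + a_2 Ro) e^{a_3 x}; their raw concentrations differ, but their SMFI values coincide. All names and constants are hypothetical.

```python
import math


def smfi(concentration, ro, a1, a2):
    """Equation (4): concentration divided by C0(Ro) = a1 * (1 + a2 * Ro), in percent."""
    return 100.0 * concentration / (a1 * (1.0 + a2 * ro))


# Hypothetical constants for one source facies (illustrative only).
A1, A2, A3 = 10.0, -0.5, -0.02
DISTANCE = 30.0


def synthetic_concentration(x, ro):
    """Synthetic 'measurement' generated from the assumed model form."""
    return A1 * (1.0 + A2 * ro) * math.exp(A3 * x)


conc_low_maturity = synthetic_concentration(DISTANCE, 0.70)
conc_high_maturity = synthetic_concentration(DISTANCE, 0.80)

smfi_low_maturity = smfi(conc_low_maturity, 0.70, A1, A2)
smfi_high_maturity = smfi(conc_high_maturity, 0.80, A1, A2)
print(conc_low_maturity, conc_high_maturity)
print(smfi_low_maturity, smfi_high_maturity)
```

In this sketch the SMFI depends only on a_3 and the distance, which is the property that lets it act as an odometer once a_1 and a_2 have been estimated for the source facies in question.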
From Equation (5), we can derive 

GM = e^{\bar{a}_3 x} \times 100\%    (6) 

where GM is the geometric mean of SMFIs and \bar{a}_3 is the arithmetic mean of a_3 for 
one type of alkylcarbazoles. For example, GM(EDMCA) represents the geometric mean of 
SMFIs of N-H exposed dimethylcarbazoles (EDMCA). 
From Equation (6), we can further obtain 

GM_1 / GM_2 = e^{(\bar{a}_3^{(1)} - \bar{a}_3^{(2)}) x}    (7) 

where GM_1 and GM_2 are the geometric means of SMFIs for two types of 
alkylcarbazoles, respectively. The \bar{a}_3^{(1)} indicates the arithmetic mean of a_3 of 
alkylcarbazoles of type one; \bar{a}_3^{(2)}, that of type two. 

Similarly, the ratios of SMFIs of different types of individual alkylcarbazoles can 
also help identify migration fractionation. From Equation (5), we can derive 

SMFI_1 / SMFI_2 = e^{(a_3^{(1)} - a_3^{(2)}) x}    (8) 

where SMFI_1 is the secondary migration fractionation index of an alkylcarbazole of 
type one; SMFI_2, that of type two. The a_3^{(1)} represents a_3 of an alkylcarbazole of 
type one; a_3^{(2)}, that of type two. 

Evidently, SMFIs, SMFI_1/SMFI_2, GMs and GM_1/GM_2 are all functions of 
migration distance x, and thus can all serve as odometers for secondary migration in a 
uniform pathway and can be used to identify migration fractionation and to further 
reveal migration directions or pathways. Both SMFI_1/SMFI_2 and GM_1/GM_2 
decrease with migration distance when SMFI_2 and GM_2 are calculated from the 
compounds of the type with comparatively low sorption capacities or sorption 
coefficients. At the filling or reference point, GM equals 100%, and both 
SMFI_1/SMFI_2 and GM_1/GM_2 equal 1. 
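
The behaviour of the ratio along a pathway can be sketched as below, assuming the exponential form of Equation (8) and assuming that stronger sorption corresponds to a more negative a_3. The two a_3 values and the distances are hypothetical.

```python
import math


def smfi_ratio(x, a3_one, a3_two):
    """Equation (8): SMFI_1 / SMFI_2 at relative migration distance x."""
    return math.exp((a3_one - a3_two) * x)


# Hypothetical coefficients: type one sorbs more strongly than type two.
A3_STRONGLY_SORBED = -0.03
A3_WEAKLY_SORBED = -0.01

distances = [0.0, 10.0, 20.0, 40.0, 80.0]
ratios = [smfi_ratio(x, A3_STRONGLY_SORBED, A3_WEAKLY_SORBED) for x in distances]
for x, ratio in zip(distances, ratios):
    print(x, ratio)
```

With the weakly sorbed type in the denominator, the ratio starts at 1 at the reference point and falls monotonically with distance, so the direction of decrease along a candidate pathway indicates the migration direction.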

The general approach of using our model and SMFI is summarized below: 

(1) Classify oils according to their source facies. For each type of source facies, 
conduct the following analyses; 

(2) Select a possible migration pathway and calculate the relative migration 
distance (as outlined in the results section); conduct non-linear regression 
analysis of Equation (1) with the data of relative migration distance, 
geotracer concentrations and Ro, to derive the constants a_1, a_2 and a_3 in 
Equation (1); 

(3) Conduct linear or polynomial regression analysis between Ro and migration 
distance x, calculate dRo/dx and then compute the maturity influence index 
(MII) from the values of a_2, a_3 and dRo/dx by using Equation (3); 

(4) If the maturity influence index is <5%, the maturity influence may be ignored. 
If it is >5%, SMFI values of geotracers (such as carbazoles) are computed using 
Equation (4) with the data of concentrations, a_1, a_2 and Ro, as shown in the 
case of the Xifeng Oilfield; 

(5) For carbazoles, calculate the geometric means of the SMFIs of N-H exposed, 
partially exposed and shielded DMCAs (dimethylcarbazoles), the ratios of these 
geometric means (N-H exposed/shielded, exposed/partially exposed and 
partially exposed/shielded DMCAs), the corresponding SMFI ratios of individual 
dimethylcarbazoles and the SMFI ratio of benzo[a]/benzo[c]carbazole; 

(6) Analyze the correlation of the SMFI values, geometric means and ratios 
against relative migration distance; if a correlation is evident, identify 
migration fractionation on the basis of a_3 and the ratios calculated in the fifth 
step; if the correlation and migration fractionation do not support the selected 
pathway, other possible pathways should be investigated by going back 
to the second step; if migration fractionation exists, the migration pathway, 
distance and direction are confirmed further with comprehensive analysis of 
geological and geochemical data. 
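
Steps (3) and (4) of the approach above can be sketched as follows. This is a minimal illustration under stated assumptions: the Ro gradient is estimated with an ordinary least-squares line, Equation (3) is evaluated at the mean Ro of the samples, and a_2 and a_3 are taken as already known from step (2). The sample data, the constants and the function names are all hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def maturity_influence_index(a2, a3, ro, dro_dx):
    """MII (%) from Equation (3)."""
    maturity_term = abs(a2 * dro_dx / (1.0 + a2 * ro))
    return 100.0 * maturity_term / (maturity_term + abs(a3))


def needs_maturity_correction(mii_percent, threshold_percent=5.0):
    """Step (4): above the 5% threshold, SMFI values should be used."""
    return mii_percent > threshold_percent


# Hypothetical samples along one candidate pathway (relative distance, Ro %).
distances = [0.0, 10.0, 20.0, 30.0, 40.0]
ro_values = [0.80, 0.78, 0.76, 0.74, 0.72]
A2, A3 = -0.5, -0.02  # assumed to come from the regression in step (2)

dro_dx = least_squares_slope(distances, ro_values)
mean_ro = sum(ro_values) / len(ro_values)
mii = maturity_influence_index(A2, A3, mean_ro, dro_dx)
use_smfi = needs_maturity_correction(mii)
print(dro_dx, mii, use_smfi)
```

Evaluating Equation (3) at the mean Ro is a simplification made here for brevity; since the index is defined locally, it could equally be evaluated sample by sample along the pathway.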

In the case of large variations in maturity, as shown in the Rimbey-Meadowbrook reef 
trend, a quadratic term (a_4 Ro^2) needs to be added inside the parentheses in 
Supplementary Equation (S12) and Equation (1), and Equations (2-4) are adjusted 
accordingly. 
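
One way to read this adjustment for the maturity influence index is sketched below. It assumes the initial concentration becomes proportional to 1 + a_2 Ro + q Ro^2, where q stands for the coefficient of the added quadratic term (called a_quad in the code, a hypothetical name), so that the maturity term in Equation (3) becomes |d ln(1 + a_2 Ro + q Ro^2)/dx|. The inputs are illustrative, not values from the case studies.

```python
def maturity_influence_index_quadratic(a2, a_quad, a3, ro, dro_dx):
    """MII (%) when C0 is proportional to 1 + a2*Ro + a_quad*Ro**2.

    The maturity term is d ln(1 + a2*Ro + a_quad*Ro**2)/dx, obtained with the
    chain rule; a_quad = 0 recovers the linear form of Equation (3).
    """
    c0_shape = 1.0 + a2 * ro + a_quad * ro ** 2
    dln_c0_dx = (a2 + 2.0 * a_quad * ro) / c0_shape * dro_dx
    return 100.0 * abs(dln_c0_dx) / (abs(dln_c0_dx) + abs(a3))


# Hypothetical inputs.
mii_linear_limit = maturity_influence_index_quadratic(-0.5, 0.0, -0.02, 0.75, 0.001)
mii_quadratic = maturity_influence_index_quadratic(-0.5, 0.1, -0.02, 1.0, 0.01)
print(mii_linear_limit, mii_quadratic)
```

Setting the quadratic coefficient to zero reproduces the linear case, which gives a quick consistency check on any adjusted implementation.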